feat: New dataset builder - DoclingSDGDatasetBuilder #205

Merged
maxmnemonic merged 3 commits into main from dev/doclingsdg_builder on Mar 30, 2026

maxmnemonic (Member) commented on Mar 27, 2026

  • New dataset builder - DoclingSDGDatasetBuilder
  • Helper utility that splits the data (pairs of Docling JSON files and images) into train/test/val splits with the given proportions
  • Safer handling of TableFormerPredictionProvider
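
The split helper described above could look roughly like the following. This is a minimal sketch, not the code added in this PR: the function name `split_pairs`, its parameters, and the file names are hypothetical, and it only illustrates a seeded shuffle followed by cuts at the cumulative proportions.

```python
import random
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (json_path, image_path); an assumed representation


def split_pairs(
    pairs: Sequence[Pair],
    proportions: Dict[str, float],
    seed: int = 0,
) -> Dict[str, List[Pair]]:
    """Shuffle the pairs and cut them into named splits (hypothetical helper)."""
    total = sum(proportions.values())
    if not proportions or total <= 0:
        raise ValueError("proportions must contain at least one positive weight")
    if any(weight < 0 for weight in proportions.values()):
        raise ValueError("proportions must be non-negative")

    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible

    names = list(proportions)
    splits: Dict[str, List[Pair]] = {}
    start = 0
    cumulative = 0.0
    for i, name in enumerate(names):
        cumulative += proportions[name] / total
        # The last split takes the remainder so rounding never drops a pair.
        is_last = i == len(names) - 1
        end = len(shuffled) if is_last else round(cumulative * len(shuffled))
        splits[name] = shuffled[start:end]
        start = end
    return splits


# Hypothetical file names, used only to exercise the function.
pairs = [(f"doc_{i}.json", f"doc_{i}.png") for i in range(10)]
splits = split_pairs(pairs, {"train": 0.8, "test": 0.1, "val": 0.1})
print({name: len(items) for name, items in splits.items()})
# → {'train': 8, 'test': 1, 'val': 1}
```

Keeping each JSON file and its image together as one tuple means a document can never end up with its two halves in different splits, and the fixed seed makes the assignment repeatable across runs.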

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
maxmnemonic self-assigned this on Mar 27, 2026

github-actions bot (Contributor) commented on Mar 27, 2026

DCO Check Passed

Thanks @maxmnemonic, all your commits are properly signed off. 🎉

mergify bot commented on Mar 27, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
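
To see what the rule above accepts, the pattern can be tried locally. The sketch below assumes the `~=` operator behaves like a Python regular-expression search; the helper name `is_conventional` is made up for illustration, and the two sample titles are the ones this PR carried at different times.

```python
import re

# Pattern copied verbatim from the merge protection rule.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True when the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


titles = [
    "feat: New dataset builder - DoclingSDGDatasetBuilder",
    "WIP: New dataset builder - DoclingSDGDatasetBuilder",
]
for title in titles:
    print(is_conventional(title), title)
# → True feat: New dataset builder - DoclingSDGDatasetBuilder
# → False WIP: New dataset builder - DoclingSDGDatasetBuilder
```

A `WIP:` prefix is not in the allowed list of types, so under this rule a title of that form would not satisfy the protection until it is renamed back to a `feat:` title.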

maxmnemonic changed the title from "feat: New dataset builder - DoclingSDGDatasetBuilder" to "WIP: New dataset builder - DoclingSDGDatasetBuilder" on Mar 30, 2026
Maksym Lysak added 2 commits March 30, 2026 16:27
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
…ing json + images) into train/test/val splits

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
maxmnemonic changed the title from "WIP: New dataset builder - DoclingSDGDatasetBuilder" to "feat: New dataset builder - DoclingSDGDatasetBuilder" on Mar 30, 2026
maxmnemonic requested a review from cau-git on March 30, 2026 at 14:35
maxmnemonic merged commit e761bcc into main on Mar 30, 2026
10 checks passed
maxmnemonic deleted the dev/doclingsdg_builder branch on March 30, 2026 at 15:31